Quick Calculation for Sample Size while Controlling False Discovery Rate with Application to Microarray Analysis

Authors

  • Peng Liu
  • J. T. Gene Hwang
Abstract

Summary. Sample size estimation is important in microarray or proteomic experiments since biologists can typically afford only a few repetitions. Classical procedures to calculate sample size are based on controlling type I error, e.g., family-wise error rate (FWER). In the context of microarray and other large-scale genomic data, it is more powerful and more reasonable to control false discovery rate (FDR) or positive FDR (pFDR) (Storey and Tibshirani, 2003). However, the traditional approach to estimating sample size no longer applies when controlling FDR, leaving most practitioners to rely on haphazard guessing. We propose a procedure to calculate sample size while controlling false discovery rate. The two major definitions of the false discovery rate (FDR in Benjamini and Hochberg, 1995, and pFDR in Storey, 2002) differ slightly; our procedure applies to both. The proposed method is straightforward to apply and requires minimal computation, as illustrated with two-sample t-tests and F-tests. We have also demonstrated by simulation that, with the calculated sample size, the desired level of power is achievable by the q-value procedure (Storey, Taylor and Siegmund, 2004) when gene expressions are either independent or dependent.
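To make the link between an FDR target and a sample size concrete, the following is a minimal sketch under stated assumptions, not the authors' exact procedure. It assumes the common approximation FDR ≈ π0·α / (π0·α + (1 − π0)·power), where π0 is the proportion of non-differentially expressed genes, and it swaps the paper's t-tests and F-tests for a two-sided normal (z-test) approximation with one common standardized effect size for all differentially expressed genes. The function names and parameters are hypothetical and chosen for illustration only.

```python
from math import ceil
from statistics import NormalDist


def per_test_alpha(fdr, pi0, power):
    """Per-test type I error rate implied by a target FDR.

    Assumes FDR ~= pi0*alpha / (pi0*alpha + (1 - pi0)*power)
    and solves that expression for alpha.
    """
    return fdr * (1.0 - pi0) * power / (pi0 * (1.0 - fdr))


def sample_size_per_group(fdr, pi0, power, effect_size):
    """Approximate per-group sample size for a two-group comparison.

    effect_size is the standardized mean difference (delta / sigma),
    assumed identical for every differentially expressed gene.
    Uses the usual normal-approximation formula
    n = 2 * ((z_{1-alpha/2} + z_{power}) / effect_size)**2,
    which ignores the negligible rejection probability in the opposite tail.
    """
    alpha = per_test_alpha(fdr, pi0, power)
    if not 0.0 < alpha < 1.0:
        raise ValueError("target FDR, pi0 and power give an invalid alpha")
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1.0 - alpha / 2.0)
    z_power = std_normal.inv_cdf(power)
    return ceil(2.0 * ((z_alpha + z_power) / effect_size) ** 2)


if __name__ == "__main__":
    # Illustrative inputs only: 95% null genes, FDR 5%, power 80%.
    n = sample_size_per_group(fdr=0.05, pi0=0.95, power=0.8, effect_size=1.0)
    print("approximate arrays per group:", n)
```

Because the normal approximation understates the sample size needed by a t-test when n is small, a result from this sketch is best read as a rough lower bound.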

Similar resources

Gene expression: Quick calculation for sample size while controlling false discovery rate with application to microarray analysis

MOTIVATION Sample size calculation is important in experimental design and is even more so in microarray or proteomic experiments since only a few repetitions can be afforded. In the multiple testing problems involving these experiments, it is more powerful and more reasonable to control false discovery rate (FDR) or positive FDR (pFDR) instead of type I error, e.g. family-wise error rate (FWER...

Sample Size Estimation while Controlling False Discovery Rate for Microarray Experiments Using the ssize.fdr Package

Microarray experiments are becoming more and more popular and critical in many biological disciplines. As in any statistical experiment, appropriate experimental design is essential for reliable statistical inference, and sample size has a crucial role in experimental design. Because microarray experiments are rather costly, it is important to have an adequate sample size that will achieve a de...

Sample size for FDR-control in microarray data analysis

We consider identifying differentially expressed genes between two patient groups using microarray experiments. We propose a sample size calculation method for a specified number of true rejections while controlling the false discovery rate at a desired level. Input parameters for the sample size calculation include the allocation proportion in each group, the number of genes in each array, the...

The False Discovery Rate in Simultaneous Fisher and Adjusted Permutation Hypothesis Testing on Microarray Data

Background and Objectives: In recent years, new technologies have produced large amounts of data, and in biology microarray technology has developed dramatically. Meanwhile, the Fisher test is used to compare the control group with two or more experimental groups and to detect differentially expressed genes. In this study, the false discovery rate was investiga...

False discovery rate, sensitivity and sample size for microarray studies

MOTIVATION In microarray data studies most researchers are keenly aware of the potentially high rate of false positives and the need to control it. One key statistical shift is the move away from the well-known P-value to false discovery rate (FDR). Perhaps less discussion has been devoted to the sensitivity or the associated false negative rate (FNR). The purpose of this paper is to explain in s...

Publication date: 2005